Practical selectivity estimation through adaptive sampling
Similar resources
The VC-Dimension of Queries and Selectivity Estimation Through Sampling
We develop a novel method, based on the statistical concept of the Vapnik-Chervonenkis dimension, to evaluate the selectivity (output cardinality) of SQL queries – a crucial step in optimizing the execution of large scale database and data-mining operations. The major theoretical contribution of this work, which is of independent interest, is an explicit bound to the VC-dimension of a range spa...
Selectivity Estimation for Joins Using Systematic Sampling
We propose a new approach to the estimation of join selectivity. The technique, which we have called “systematic sampling”, is a novel variant of the sampling-based approach. Systematic sampling works as follows: Given a relation R of N tuples, with a join attribute that can be accessed in ascending/descending order via an index, if n is the number of tuples to be sampled from R, select a tuple...
Query Estimation by Adaptive Sampling
The ability to provide accurate and efficient result estimations of user queries is very important for the query optimizer in database systems. In this paper, we show that the traditional estimation techniques with data reduction points of view do not produce satisfiable estimation results if the query patterns are dynamically changing. We further show that to reduce query estimation error, ins...
Adaptive Threshold Sampling and Estimation
Sampling is a fundamental problem in both computer science and statistics. A number of issues arise when designing a method based on sampling. These include statistical considerations such as constructing a good sampling design and ensuring there are good, tractable estimators for the quantities of interest as well as computational considerations such as designing fast algorithms for streaming ...
The VC-Dimension of SQL Queries and Selectivity Estimation through Sampling
In this work we show how Vapnik-Chervonenkis (VC) dimension, a fundamental result in statistical learning theory, can be used to evaluate the selectivity (output cardinality) of SQL queries, a core problem in large database management. The major theoretical contribution of this work, which is of independent interest, is an explicit bound to the VC-dimension of a range space defined by all possi...
Journal
Journal title: ACM SIGMOD Record
Year: 1990
ISSN: 0163-5808
DOI: 10.1145/93605.93611